Toward Ontology-based Knowledge Extraction from Web Data with the Lexicalization of Ontology for Korean QA System

Authors

  • Younggyun Hahm
  • Hee-Geun Yoon
  • Se-Young Park
  • Seong-Bae Park
  • Jungwon Cha
  • Dosam Hwang
  • Key-Sun Choi
Abstract

Most knowledge is written in natural language, and structured knowledge bases cover only a limited part of it. From the perspective of a QA system, the quality of a knowledge base depends on how well it covers the knowledge needed to answer users' questions. To address this knowledge base construction problem, we define sets of natural language questions and answer documents that contain enough knowledge to answer them, and we manually generate triples in the DBpedia schema to use as a test dataset. In this paper we present a framework that automatically structures knowledge from unstructured and semi-structured data, using a pattern learning method, in order to cover the knowledge needed to answer the natural language question dataset. Our platform extracts entities and their relations as triples, with a schema based on ontological knowledge representation. To achieve this goal, our pattern learning method is based on the lexicalization of the ontology. For unstructured data, we lexicalize ontological representations from natural language using its dependency structure. For semi-structured data, the headers of tables are also lexicalized. We have instantiated the system for Korean. To evaluate the system, we build gold-standard triple datasets from natural language and use F-measure and coverage as evaluation measures.
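The evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exact-match comparison of triples, the function names `f_measure` and `question_coverage`, the `ex:` placeholder entities, and in particular the definition of coverage (the share of questions whose gold triples are all extracted) are assumptions made for illustration, since the abstract does not define them.

```python
from typing import Dict, Set, Tuple

# A triple is (subject, predicate, object), e.g. in a DBpedia-style schema.
Triple = Tuple[str, str, str]


def f_measure(extracted: Set[Triple], gold: Set[Triple]) -> Dict[str, float]:
    """Precision, recall and F1, counting a triple as correct on exact match."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def question_coverage(
    extracted: Set[Triple], gold_by_question: Dict[str, Set[Triple]]
) -> float:
    """Assumed definition: fraction of questions whose gold triples
    are all present in the extracted set."""
    if not gold_by_question:
        return 0.0
    covered = sum(
        1 for triples in gold_by_question.values() if triples <= extracted
    )
    return covered / len(gold_by_question)


# Hypothetical data: the entity names are placeholders, not from the paper.
gold_by_question = {
    "q1": {("ex:CountryA", "dbo:capital", "ex:CityB")},
    "q2": {("ex:PersonC", "dbo:birthPlace", "ex:CityB")},
}
gold = set().union(*gold_by_question.values())
extracted = {
    ("ex:CountryA", "dbo:capital", "ex:CityB"),    # matches a gold triple
    ("ex:PersonC", "dbo:birthPlace", "ex:CityD"),  # wrong object
}

scores = f_measure(extracted, gold)
coverage = question_coverage(extracted, gold_by_question)
```

With this toy data one of the two extracted triples matches the gold set, and only question `q1` has all of its gold triples extracted. A real evaluation might also need to normalize entity identifiers before comparing triples.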

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the Web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)

Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...

Publication date: 2014